Structural Regression Trees
Author
Abstract
In many real-world domains, the task of a machine learning algorithm is to learn a theory that predicts numerical values. In particular, several standard test domains used in Inductive Logic Programming (ILP) involve predicting numerical values from examples and from relational, mostly non-determinate background knowledge. So far, however, only one ILP algorithm, a covering algorithm called FORS, can both predict numbers and cope with non-determinate background knowledge. In this paper we present Structural Regression Trees (SRT), a new algorithm that can be applied to this class of problems by integrating the statistical method of regression trees into ILP. SRT constructs a tree containing a literal (an atomic formula or its negation) or a conjunction of literals in each node, and assigns a numerical value to each leaf. SRT provides more comprehensible results than purely statistical methods, and can be applied to a class of problems most other ILP systems cannot handle. Experiments in several real-world domains demonstrate that the approach is competitive with existing methods, indicating that these advantages do not come at the expense of predictive accuracy.
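The tree structure described in the abstract (a test in each inner node, a number in each leaf) can be illustrated with a minimal Python sketch. This is not SRT itself: the abstract does not specify how literals are generated, how conjunctions are built, or when growth stops, so the sketch assumes a fixed dictionary of candidate tests, a squared-error split criterion borrowed from ordinary regression trees, and a hypothetical `min_size` stopping rule. The names `Node`, `grow`, `predict`, and the toy molecule data are all illustrative. Each test is an existential query over a relational example ("the molecule has some chlorine atom"), which stands in for a literal over non-determinate background knowledge.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List, Optional

Example = Dict[str, list]          # assumption: an example is a dict of fact lists
Test = Callable[[Example], bool]   # stands in for a literal that succeeds or fails


@dataclass
class Node:
    value: float                   # mean target of the training examples at this node
    name: Optional[str] = None     # name of the test; None for a leaf
    test: Optional[Test] = None
    yes: Optional["Node"] = None   # subtree for examples where the test succeeds
    no: Optional["Node"] = None    # subtree for examples where the test fails


def sse(ys: List[float]) -> float:
    """Sum of squared deviations from the mean (ys must be non-empty)."""
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)


def grow(examples: List[Example], targets: List[float],
         tests: Dict[str, Test], min_size: int = 2) -> Node:
    """Greedily pick the test that minimizes the summed squared error of both branches."""
    node = Node(value=mean(targets))
    best = None
    for name, test in tests.items():
        mask = [test(e) for e in examples]
        yes_y = [y for y, m in zip(targets, mask) if m]
        no_y = [y for y, m in zip(targets, mask) if not m]
        # Assumed stopping rule: each branch needs at least min_size (>= 1) examples.
        if len(yes_y) < min_size or len(no_y) < min_size:
            continue
        score = sse(yes_y) + sse(no_y)
        if best is None or score < best[0]:
            best = (score, name, test, mask)
    if best is None or best[0] >= sse(targets):
        return node                # no admissible or useful split: this node is a leaf
    _, node.name, node.test, mask = best
    yes = [(e, y) for e, y, m in zip(examples, targets, mask) if m]
    no = [(e, y) for e, y, m in zip(examples, targets, mask) if not m]
    node.yes = grow([e for e, _ in yes], [y for _, y in yes], tests, min_size)
    node.no = grow([e for e, _ in no], [y for _, y in no], tests, min_size)
    return node


def predict(node: Node, example: Example) -> float:
    """Follow the tests from the root to a leaf and return the leaf's value."""
    while node.test is not None:
        node = node.yes if node.test(example) else node.no
    return node.value


# Toy data (invented for illustration): molecules described by their atoms.
molecules = [
    {"atoms": ["c", "c", "cl"]},
    {"atoms": ["c", "cl", "cl"]},
    {"atoms": ["c", "c", "o"]},
    {"atoms": ["c", "o", "o"]},
]
activity = [5.0, 5.2, 1.0, 1.2]
candidate_tests = {
    "has_cl": lambda m: "cl" in m["atoms"],  # "there exists a chlorine atom"
    "has_o": lambda m: "o" in m["atoms"],    # "there exists an oxygen atom"
}
tree = grow(molecules, activity, candidate_tests)
```

Calling `predict(tree, {"atoms": ["n", "cl"]})` routes the new molecule through the root test and returns the mean activity of the training molecules in the matching leaf. Because every inner node names the test it applies, the learned model can be read as a set of rules, which is the kind of comprehensibility the abstract contrasts with purely statistical methods.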
Similar resources
Predicting The Type of Malaria Using Classification and Regression Decision Trees
Maryam Ashoori (School of Technical and Engineering, Higher Educational Complex of Saravan, Saravan, Iran), Fatemeh Hamzavi (School of Agriculture, Higher Educational Complex of Saravan, Saravan, Iran). Abstract. Background: Malaria is an infectious disease that infects 200-300 million people annually. Environme...
Changes of phenolic compounds and non-structural carbohydrates on alternate bearing cycle in ‘Kinnow’ mandarin trees
In this study we demonstrate the variations in carbohydrates and phenolic compounds present in the leaves and stems of ‘Kinnow’ mandarin (Citrus reticulata Blanco) trees during the alternate bearing cycle, and the possible involvement of these compounds in the flower bud formation process. The amounts of these compounds were determined monthly in the leaves and stems of “on” and “off” trees from Nov...
Learning GP-trees from Noisy Data
We discuss the problem of model selection in Genetic Programming using the framework provided by Statistical Learning Theory, i.e. Vapnik-Chervonenkis theory (VC). We present empirical comparisons between classical statistical methods (AIC, BIC) for model selection and the Structural Risk Minimization method (based on VC-theory) for symbolic regression problems. Empirical comparisons of differe...
Factors Influencing Drug Injection History among Prisoners: A Comparison between Classification and Regression Trees and Logistic Regression Analysis
Background: Due to the importance of medical studies, researchers in this field should be familiar with various types of statistical analyses so that they can select the most appropriate method based on the characteristics of their data sets. Classification and regression trees (CARTs) can complement regression models. We compared the performance of a logistic regression model and a CART in predi...
Detecting Multiple Mean Breaks At Unknown Points With Atheoretical Regression Trees
In this paper we propose a computationally effective approach to detect multiple structural breaks in the mean occurring at unknown dates. We propose a non-parametric approach that exploits, in the framework of least squares regression trees, the contiguity property of the Fisher grouping method (1958) proposed for grouping a single real variable. The proposed approach is applied to study the p...